The growing popularity of social media has raised concerns about children's safety online. Interactions between minors and adults with predatory intentions are a particularly grave concern. Research on online sexual grooming has often relied on domain experts to manually annotate conversations, limiting both the scale and scope of such studies. In this work, we test how well automated methods can detect conversational behaviors and replace expert human annotation. Informed by psychological theories of online grooming, we label 6772 chat messages sent by child-sex offenders with one of eleven predatory behaviors. We train bag-of-words and natural language inference models to classify each behavior and show that the best-performing models classify the behaviors in a manner consistent with, but not equivalent to, human annotation.
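A bag-of-words behavior classifier of the kind mentioned above can be sketched in a few lines. The messages, labels, vocabulary, and nearest-neighbour scoring below are invented stand-ins for illustration; they are not the study's data or models:

```python
from collections import Counter

def bow_vector(text, vocab):
    """Bag-of-words: represent a message by its word counts over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    return [counts[w] for w in vocab]

# Toy stand-in data: the real study labels 6772 offender messages with
# one of eleven behaviors; the messages and labels here are invented.
train = [("do your parents check your phone", "isolation"),
         ("you seem so mature for your age", "flattery")]
vocab = sorted({w for text, _ in train for w in text.lower().split()})

def classify(message):
    """Nearest-neighbour over bag-of-words vectors (a minimal stand-in
    for the paper's bag-of-words classifiers)."""
    v = bow_vector(message, vocab)
    def overlap(item):
        return sum(a * b for a, b in zip(v, bow_vector(item[0], vocab)))
    return max(train, key=overlap)[1]
```

A real system would use a much larger vocabulary and a trained classifier, but the representation step is the same.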
Speech-driven 3D facial animation has been widely explored, with applications in gaming, character animation, virtual reality, and telepresence systems. State-of-the-art methods deform the face topology of the target actor to sync the input audio without considering the identity-specific speaking style and facial idiosyncrasies of the target actor, thus, resulting in unrealistic and inaccurate lip movements. To address this, we present Imitator, a speech-driven facial expression synthesis method, which learns identity-specific details from a short input video and produces novel facial expressions matching the identity-specific speaking style and facial idiosyncrasies of the target actor. Specifically, we train a style-agnostic transformer on a large facial expression dataset which we use as a prior for audio-driven facial expressions. Based on this prior, we optimize for identity-specific speaking style based on a short reference video. To train the prior, we introduce a novel loss function based on detected bilabial consonants to ensure plausible lip closures and consequently improve the realism of the generated expressions. Through detailed experiments and a user study, we show that our approach produces temporally coherent facial expressions from input audio while preserving the speaking style of the target actors.
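The bilabial-consonant loss is described only at a high level, so the following is a guessed, minimal form: it averages the lip-opening distance over frames flagged as bilabial consonants (/b/, /p/, /m/), penalizing any lip opening on those frames. The function name and inputs are assumptions, not the paper's implementation:

```python
import numpy as np

def bilabial_closure_loss(lip_gap, bilabial_mask):
    """Penalize lip opening on frames aligned with detected bilabial
    consonants (/b/, /p/, /m/), where the lips should be fully closed.
    lip_gap: per-frame distance between upper- and lower-lip vertices.
    bilabial_mask: 1.0 on detected bilabial frames, else 0.0.
    Guessed, simplified form; not the paper's exact loss."""
    return float(np.sum(lip_gap * bilabial_mask) / (np.sum(bilabial_mask) + 1e-8))
```

The loss is zero exactly when the lips are closed on every flagged frame, regardless of how the mouth moves elsewhere.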
Artificial Intelligence (AI) has become commonplace to solve routine everyday tasks. Because of the exponential growth in medical imaging data volume and complexity, the workload on radiologists is steadily increasing. We project that the gap between the number of imaging exams and the number of expert radiologist readers required to cover this increase will continue to expand, consequently introducing a demand for AI-based tools that improve the efficiency with which radiologists can comfortably interpret these exams. AI has been shown to improve efficiency in medical-image generation, processing, and interpretation, and a variety of such AI models have been developed across research labs worldwide. However, very few of these, if any, find their way into routine clinical use, a discrepancy that reflects the divide between AI research and successful AI translation. To address the barrier to clinical deployment, we have formed MONAI Consortium, an open-source community which is building standards for AI deployment in healthcare institutions, and developing tools and infrastructure to facilitate their implementation. This report represents several years of weekly discussions and hands-on problem solving experience by groups of industry experts and clinicians in the MONAI Consortium. We identify barriers between AI-model development in research labs and subsequent clinical deployment and propose solutions. Our report provides guidance on processes which take an imaging AI model from development to clinical implementation in a healthcare institution. We discuss various AI integration points in a clinical Radiology workflow. We also present a taxonomy of Radiology AI use-cases. Through this report, we intend to educate the stakeholders in healthcare and AI (AI researchers, radiologists, imaging informaticists, and regulators) about cross-disciplinary challenges and possible solutions.
The de facto standard of dynamic histogram binning for radiomic feature extraction leads to an elevated sensitivity to fluctuations in annotated regions. This may affect the majority of recently published radiomic studies and contribute to the poor reproducibility of radiomics-based machine learning that has prompted significant data-harmonization efforts; however, we believe the issues highlighted here are comparatively neglected, yet often remedied simply by choosing static binning. The field of radiomics has improved through the development of community standards and open-source libraries such as PyRadiomics, but differences in image acquisition, systematic differences between observers' annotations, and preprocessing steps still pose challenges. These can change the distribution of voxels, altering the extracted features, and the effect can be exacerbated by dynamic binning.
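The sensitivity described above is easy to demonstrate. Dynamic binning stretches a fixed number of bins over the ROI's own intensity range, so a small change to the annotation (here, one extra bright voxel) shifts every bin edge; static binning fixes the bin width and origin, so the discretization of the unchanged voxels is unaffected. A minimal sketch (function names and parameters are illustrative, not the PyRadiomics API):

```python
import numpy as np

def discretize_dynamic(voxels, n_bins=64):
    """Dynamic binning: a fixed number of bins stretched over the ROI's
    own intensity range, so every bin edge moves with the annotation."""
    edges = np.linspace(voxels.min(), voxels.max(), n_bins + 1)
    return np.clip(np.digitize(voxels, edges[1:-1]), 0, n_bins - 1)

def discretize_static(voxels, bin_width=25.0, origin=0.0):
    """Static binning: a fixed bin width anchored at a fixed origin,
    independent of the annotated region's intensity range."""
    return np.floor((voxels - origin) / bin_width).astype(int)

# One extra bright voxel in the annotation changes every dynamic bin
# assignment, while static assignments of the original voxels are untouched.
roi = np.array([10.0, 20.0, 30.0, 40.0])
roi_perturbed = np.append(roi, 200.0)
```

Any texture feature computed from the discretized volume inherits this instability under dynamic binning.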
Every autonomous driving dataset has a different sensor configuration, originates from a different geographic region, and covers different scenarios. As a result, 3D detectors tend to overfit the datasets they are trained on: training a detector on one dataset and testing it on another causes a drastic drop in accuracy. We observe that differences in lidar scan patterns form a large component of this performance degradation. We address this by designing a novel viewer-centred surface completion network (VCN) to complete the surfaces of objects of interest within an unsupervised domain adaptation framework, yielding our method SEE-VCN. With SEE-VCN, we obtain a unified representation of objects across datasets, allowing the network to focus on learning geometry rather than overfitting to scan patterns. By adopting a domain-invariant representation, SEE-VCN can be classed as a multi-target domain adaptation approach in which no annotations or re-training are required to obtain 3D detections for new scan patterns. Through extensive experiments, we show that our approach outperforms previous domain adaptation methods in multiple domain adaptation settings. Our code and data are available at https://github.com/darrenjkt/see-vcn.
Deep neural network (DNN) models are typically trained sequentially from one layer to the next, which causes forward, backward, and update locking problems and leads to poor training-time performance. Existing parallel strategies that mitigate these problems deliver suboptimal runtime performance. In this work, we propose a novel layer-wise partitioning and merging, forward and backward pass parallel framework that provides better training performance. The novelty of the proposed work lies in 1) a layer-wise partition and merging model that minimizes the communication overhead between devices without the memory cost incurred by existing strategies during training; and 2) forward-pass and backward-pass parallelization and optimization that address the update-locking problem and minimize the total training cost. Experimental evaluation on real use cases shows that the proposed method outperforms the state-of-the-art in training speed, achieving almost linear speedup without compromising the accuracy achieved by non-parallel approaches.
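As an illustration of layer-wise partitioning, the sketch below greedily splits per-layer compute costs into contiguous, roughly balanced stages. This is a simplified stand-in: the proposed scheme additionally merges partitions to minimize inter-device communication and parallelizes the forward and backward passes, which this sketch does not model:

```python
def partition_layers(costs, n_stages):
    """Greedily split per-layer compute costs into n_stages contiguous
    groups of roughly equal total cost. Illustrative only; does not
    model the communication-aware merging of the proposed framework."""
    target = sum(costs) / n_stages
    stages, current, acc = [], [], 0.0
    for i, c in enumerate(costs):
        current.append(i)
        acc += c
        layers_left = len(costs) - i - 1
        stages_left = n_stages - 1 - len(stages)
        # Cut a stage once it reaches the target cost, as long as enough
        # layers remain to populate the remaining stages.
        if acc >= target and stages_left > 0 and layers_left >= stages_left:
            stages.append(current)
            current, acc = [], 0.0
    stages.append(current)
    return stages
```

Each returned group would be placed on one device; balancing the per-stage cost keeps pipeline bubbles small.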
Human behavior is increasingly captured on mobile devices, prompting growing interest in automated human activity recognition. However, existing datasets typically consist of scripted movements. Our long-term goal is to perform mobile activity recognition in natural settings. We collect a dataset to support activity categories relevant to downstream tasks such as health monitoring and intervention. Because human behavior varies considerably between individuals, we collect data from many participants across two distinct age groups. Because human behavior also changes over time, we collect data from each participant over a one-month period to capture temporal drift. We hypothesize that mobile activity recognition can benefit from unsupervised domain adaptation algorithms. To meet this need and test this hypothesis, we analyze the performance of domain adaptation across people and across time. We then enhance unsupervised domain adaptation with contrastive learning, and with weak supervision when a small proportion of labels is available. The dataset is available at https://github.com/wsu-casas/smartwatch-data
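The contrastive enhancement mentioned above can be illustrated with a minimal InfoNCE-style loss, which pulls an anchor embedding toward a positive (e.g., an augmented view of the same sensor window) and pushes it away from negatives. This is a generic sketch, not the paper's exact formulation:

```python
import numpy as np

def info_nce(anchor, positive, negatives, tau=0.1):
    """InfoNCE-style contrastive loss for a single anchor embedding:
    low when the anchor is most similar to its positive, high when a
    negative is more similar (generic sketch, not the paper's loss)."""
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    logits = np.array([cos(anchor, positive)] +
                      [cos(anchor, n) for n in negatives]) / tau
    logits -= logits.max()                      # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return float(-np.log(probs[0]))
```

Minimizing this loss over pairs drawn from both domains encourages embeddings that are invariant to person- and time-specific variation.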
The cortical surface reconstruction problem from magnetic resonance imaging has traditionally been addressed with lengthy image-processing pipelines such as FreeSurfer, CAT, or CIVET. These frameworks take too long to be feasible for real-time applications and are impractical for large-scale studies. Recently, supervised deep learning approaches have been introduced to speed up this task, cutting the reconstruction time from hours to seconds. Taking the state-of-the-art CorticalFlow model as a blueprint, this paper proposes three modifications to improve its accuracy and interoperability with existing surface analysis tools without sacrificing its fast inference time and low GPU memory consumption. First, we employ a more accurate ODE solver to reduce the diffeomorphic mapping approximation error. Second, we devise a routine to produce smoother template meshes, avoiding mesh artifacts caused by the sharp edges of CorticalFlow's convex-hull-based templates. Finally, we recast pial surface prediction as a deformation of the predicted white surface, leading to a one-to-one mapping between white and pial surface vertices. This mapping is essential to many existing surface analysis tools for surface morphometry. We name the resulting method CorticalFlow$^{++}$. Using large-scale datasets, we demonstrate that the proposed changes provide higher geometric accuracy and surface regularity while keeping the reconstruction time and GPU memory requirements almost unchanged.
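The effect of the ODE-solver choice can be illustrated on a toy flow field: integrating the same stationary velocity field with forward Euler versus fourth-order Runge-Kutta (RK4), the higher-order solver tracks the exact deformation far more closely at the same step count. The rotational field below is a toy stand-in for the learned deformation field, not the model's actual flow:

```python
import numpy as np

def flow(x):
    """Toy stationary velocity field (a pure rotation); stands in for
    the learned deformation field of CorticalFlow-style models."""
    return np.stack([-x[:, 1], x[:, 0]], axis=1)

def integrate_euler(x, steps=20, T=1.0):
    h = T / steps
    for _ in range(steps):
        x = x + h * flow(x)
    return x

def integrate_rk4(x, steps=20, T=1.0):
    h = T / steps
    for _ in range(steps):
        k1 = flow(x)
        k2 = flow(x + 0.5 * h * k1)
        k3 = flow(x + 0.5 * h * k2)
        k4 = flow(x + h * k3)
        x = x + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return x

# Integrating the rotation for T=1 should rotate each vertex by 1 radian.
verts = np.array([[1.0, 0.0]])
exact = np.array([[np.cos(1.0), np.sin(1.0)]])
```

For mesh deformation, reducing this integration error directly reduces the diffeomorphic mapping approximation error at a fixed number of steps.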
Exploration in high-dimensional, continuous spaces with sparse rewards is an open problem in reinforcement learning. Artificial curiosity algorithms address this by creating rewards that lead to exploration. Given a reinforcement learning algorithm capable of maximizing rewards, the problem reduces to finding an optimization objective consistent with exploration. Maximum entropy exploration uses the entropy of the state visitation distribution as such an objective. However, efficiently estimating the entropy of the state visitation distribution is challenging in high-dimensional, continuous spaces. We introduce an artificial curiosity algorithm based on a lower bound on an approximation of the entropy of the state visitation distribution. The bound relies on a result we prove for nonparametric density estimation in arbitrary dimensions using k-means. We show that our approach is both computationally efficient and competitive on benchmarks for exploration in high-dimensional, continuous spaces, especially on tasks where reinforcement learning algorithms are unable to find rewards.
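A rough illustration of a k-means-based entropy surrogate: states visited in a spread-out fashion lie farther from their nearest cluster centroid, so the mean log-distance to the nearest of k centroids grows with the spread of the visitation distribution. This is an illustrative proxy only, not the paper's proven lower bound:

```python
import numpy as np

def entropy_proxy(states, centroids):
    """Surrogate for state-visitation entropy: mean log-distance from
    each visited state to its nearest k-means centroid. More spread-out
    visitation yields larger nearest-centroid distances and a larger
    value. Illustrative only; not the paper's proven lower bound."""
    dists = np.linalg.norm(states[:, None, :] - centroids[None, :, :], axis=-1)
    nearest = dists.min(axis=1)
    return float(np.mean(np.log(nearest + 1e-8)))
```

Used as an intrinsic reward, a quantity like this pays the agent for visiting states far from where it has already clustered its experience.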
Smart weeding systems that perform plant-specific operations can contribute to the sustainability of agriculture and the environment. Despite monumental advances in autonomous robotic technologies for precision weed management in recent years, work on under-canopy weeding in fields has yet to be realized. A prerequisite for such systems is reliable detection and classification of weeds to avoid mistakenly spraying and thereby damaging the surrounding plants. Real-time multi-class weed identification enables species-specific treatment of weeds and significantly reduces the amount of herbicide used. Here, our first contribution is the first adequately large, realistic image dataset \textit{AIWeeds} (one/multiple kinds of weeds in one image), comprising about 10,000 annotated images of flax and the 14 most common weeds in fields and gardens, taken from 20 different locations in North Dakota, California, and Central China. Second, we provide a complete pipeline from model training with maximum efficiency to deploying the optimized model onto a single-board computer. Based on \textit{AIWeeds} and the pipeline, we present a baseline for classification performance using five benchmark CNN models. Among them, MobileNetV2, with the shortest inference time and lowest memory consumption, is a qualified candidate for real-time applications. Finally, we deploy MobileNetV2 onto our own compact autonomous robot \textit{SAMBot} for real-time weed detection. On previously unseen scenes in a flax field (with a 0.2-0.3 m row spacing, weeds and flax, distortion, blur, and shadows), it achieves 90\% test accuracy, a milestone towards real-world precision weed control. We publicly release the dataset and the code to generate it at \url{https://github.com/structurescomp/multi-class-weed-classification}.